[core][apps] Implemented the socket close reason feature #2747
POC for the "close reason" feature.
The close reason is a code that records who was responsible for breaking the connection and closing the socket.
The reason is "one shot": it holds the unset value from socket creation and throughout the connection, until the connection is broken or the socket is closed. Such events may happen several times on one socket, but only the first one counts; later ones do not override a value once it is set.
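The "one shot" rule can be illustrated with a minimal sketch. This is not the PR's code: the type `demo_close_slot`, the functions, and the numeric values are hypothetical stand-ins chosen only to show first-write-wins behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder values; the real codes are defined by the PR. */
#define DEMO_CLS_UNSET    (-1)
#define DEMO_CLS_API      3
#define DEMO_CLS_PEERIDLE 7

/* Holds the close reason for one socket. */
typedef struct {
    int reason;  /* DEMO_CLS_UNSET until the first close/break event */
} demo_close_slot;

static void demo_slot_init(demo_close_slot *s)
{
    assert(s != NULL);
    s->reason = DEMO_CLS_UNSET;
}

/* Records the reason only if none was recorded before.
 * Returns 1 if the value was stored, 0 if an earlier value was kept. */
static int demo_slot_set_once(demo_close_slot *s, int reason)
{
    assert(s != NULL);
    if (s->reason != DEMO_CLS_UNSET)
        return 0;
    s->reason = reason;
    return 1;
}
```

With this shape, a socket that first times out and is then closed by the application keeps the timeout code, because the second call leaves the stored value untouched.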
In general, this code can be set from the following sources:
If the connection was broken because a `UMSG_SHUTDOWN` message was received, that message should also carry the reason code. If it does, the side that received the message records it as the peer code, and its agent code contains the "peer" redirection information. If the agent was the first to break the connection or close the socket, the appropriate code is in the agent code location and the peer code is unset.

API changes:
- The `SRT_CLOSE_INFO` structure provides the close reason code information:
  - `agent`: code for the agent, if it was the agent that broke the socket; otherwise it contains `SRT_CLS_PEER`.
  - `peer`: code for the peer extracted from the `UMSG_SHUTDOWN` message, or `SRT_CLS_FALLBACK` if the peer doesn't support it.
  - `time`: time when it happened, in the same convention as `srt_time_now()`.
- `srt_close_getreason(socket, close_info_ptr)` retrieves the close reason for a socket. It doesn't matter whether `srt_close()` was called on the socket before this call, but you should not call it unless you know that there was a transmission error on the socket, or at least that the connection was broken.
- `srt_close_withreason(socket, reason_code)` closes the socket and sets the user reason. Only values of `SRT_CLSC_USER` or greater are accepted; otherwise it behaves just like `srt_close()`.
- `SRT_CLOSE_REASON` is filled with reason codes. The close reason codes are designed to match the rejection codes at least in some common part, but only about half of them are similar in both.

Protocol changes:
The only protocol change is that the `UMSG_SHUTDOWN` message now carries additional data: the 32-bit close reason code value. Messages that carried no extra data were always zero-padded to 4 bytes because, as the UDT author explained, `writev` (and likely later `sendmsg`) did not accept zero-length block declarations in multi-block calls. So far, `UMSG_SHUTDOWN` was therefore actually sending a 4-byte body filled with the value 0, which matches the `SRT_CLS_UNKNOWN` code. So, for compatibility:

- `UMSG_SHUTDOWN` from an old peer to a new peer carries the `SRT_CLS_UNKNOWN` value, which is then converted into `SRT_CLS_FALLBACK`, meaning that the connection was shut down by the other party, but that party doesn't support the feature.
- `UMSG_SHUTDOWN` to an old peer is not a problem, because older versions do not read any body data from this message.

Nuisances:
- A side that was shut down by the other party reports `.agent == SRT_CLS_PEER` and that party's reason code in `.peer`. It may always happen that both sides break the connection at the same time independently, in which case both sides may show up as responsible. This is also why the close reason cannot be a single field for all purposes, and why the rejection reason functionality cannot be even partially reused.
- The `UMSG_SHUTDOWN` message is volatile and not under journal protection (unlike the handshake or ACK). If the UDP packet carrying this message is lost in the network, the party that should have received it will not learn about the shutdown and will likely end up with `.agent == SRT_CLS_PEERIDLE`. So although message passing exists to keep the blame on the peer, it is not guaranteed to work in every case.

The recommendation for applications: always extract the close reason code and present it to the user, and retrieve it on both connection parties to obtain as much information as possible. This information is highly volatile: there may be multiple reasons why the socket was closed or the connection was broken, and this functionality provides only the very first of them.
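As a rough illustration of that recommendation, the sketch below shows how an application might interpret the `agent` and `peer` fields once it has retrieved them. It does not call the SRT library: `demo_close_info` is a local stand-in for `SRT_CLOSE_INFO`, its field types are assumptions, and the `DEMO_CLS_*` values are placeholders, not the real numbering.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder values: the real SRT_CLS_* numbering is not stated here. */
#define DEMO_CLS_PEER     1
#define DEMO_CLS_FALLBACK 2

/* Stand-in for SRT_CLOSE_INFO; the field types are assumptions. */
typedef struct {
    int     agent;  /* agent's code, or DEMO_CLS_PEER if the peer closed */
    int     peer;   /* peer's code from UMSG_SHUTDOWN, or DEMO_CLS_FALLBACK */
    int64_t time;   /* when it happened */
} demo_close_info;

/* Nonzero if the agent code redirects the blame to the peer. */
static int demo_closed_by_peer(const demo_close_info *ci)
{
    assert(ci != NULL);
    return ci->agent == DEMO_CLS_PEER;
}

/* The code worth showing: the peer's when redirected, else the agent's. */
static int demo_effective_reason(const demo_close_info *ci)
{
    return demo_closed_by_peer(ci) ? ci->peer : ci->agent;
}

/* Prints a one-line summary for the user. */
static void demo_report(const demo_close_info *ci)
{
    if (!demo_closed_by_peer(ci))
        printf("closed by agent, code %d\n", ci->agent);
    else if (ci->peer == DEMO_CLS_FALLBACK)
        printf("closed by peer, which did not send a reason code\n");
    else
        printf("closed by peer, code %d\n", ci->peer);
}
```

Running the same report on both ends of a connection makes the lost-`UMSG_SHUTDOWN` case visible: one side names itself as the closer while the other shows an idle-timeout style agent code instead of the peer redirection.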
The testing application `srt-test-live` displays this code, and it also uses two user codes: 0 (transmission interrupted) and 1 (configuration error).
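For the `UMSG_SHUTDOWN` compatibility rule described under the protocol changes, a minimal decoding sketch might look as follows. The function names are hypothetical, the value of `DEMO_CLS_FALLBACK` is a placeholder, and both the big-endian byte order and the handling of a body shorter than 4 bytes are assumptions of this sketch; only the mapping of 0 (`SRT_CLS_UNKNOWN`) to the fallback code comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CLS_UNKNOWN  0u  /* value 0, as stated in the description */
#define DEMO_CLS_FALLBACK 2u  /* placeholder value, not the real one */

/* Reads a 32-bit word stored in big-endian (network) byte order. */
static uint32_t demo_read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Extracts the peer's close reason from a UMSG_SHUTDOWN body.
 * An all-zero body is what an old peer sends as padding, so it is
 * reported as the fallback code. A missing or short body is treated
 * the same way (an assumption made for this sketch). */
static uint32_t demo_shutdown_reason(const unsigned char *body, size_t len)
{
    uint32_t code;

    if (body == NULL || len < 4)
        return DEMO_CLS_FALLBACK;

    code = demo_read_be32(body);
    return code == DEMO_CLS_UNKNOWN ? DEMO_CLS_FALLBACK : code;
}
```

The sending direction needs no special case in this model: a new peer always writes its code into the 4 bytes that were previously padding, and an old receiver ignores them.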